The Freezing of Random RNA 
We study secondary structures of random RNA molecules by means of a renormalized field theory based 
on an expansion in the sequence disorder. We show that there is a continuous phase transition from a molten 
phase at higher temperatures to a low-temperature glass phase. The primary freezing occurs above the critical 
temperature, with local islands of stable folds forming within the molten phase. The size of these islands defines 
the correlation length of the transition. Our results include critical exponents at the transition and in the glass 
phase. 
RNA has various important functions in the cell: it forms 
viral genomes, and has been attributed a key role in the ori- 
gin of life. RNA molecules fold into unique compact con- 
figurations able to perform catalytic functions, and they can 
act as templates for the readout of sequence information. In 
this sense, they are nature's compromise between DNA and 
proteins, which explains their likely role in early evolution as 
well as their ubiquity in today's molecular biology. Typical 
RNA folds at room temperature consist of stems (i.e., parts 
of the molecule forming a helical double strand stabilized by 
Watson-Crick base pairing) linked by loops (i.e., stretches of 
unpaired monomers). These conformations are governed by 
the energies of base pairing and backbone bending as well as 
by the entropy of the loops; their statistical physics is quite 
complicated. Yet, the problem is more tractable than protein 
folding since the free energy of an RNA fold can be separated 
energetically into that of its secondary and its tertiary structure 
[1, 2]. Labeling the bases consecutively along the backbone 
of the molecule from 1 to $L_0$, the secondary structure 
of the fold is completely defined by the Watson-Crick pairs 
(s, t) ($1 \le s < t \le L_0$), subject to the constraint that different 
pairs are either independent (s < t < s' < t') or nested 
(s < s' < t' < t); see fig. 1. Thus, the secondary struc- 
ture contains purely "topological" information about the fold, 
which is independent of the spatial configuration. Due to the 
constraint on base pairings, secondary structures can always 
be represented by planar diagrams as shown in fig. 1. The 
interactions satisfying this constraint are often the dominant 
part of the free energy, so the secondary structure of a fold 
can be determined self-consistently. There are efficient algo- 
rithms to compute the exact partition function of secondary 
structures for a given sequence [3, 4]. Base pairings violating 
the constraint (so-called pseudoknots) as well as additional in- 
teractions between paired bases are important for the tertiary 
structure of the molecule (i.e., the full spatial arrangement of 
stems and loops) but they generate only small-scale rearrangements 
of the secondary structure. While this separation 
of energies is only approximate, it can be tuned experimentally 
by varying salt concentrations in the solution [6]. Hence, 
a theory of secondary structures is an important starting point 
for understanding RNA conformations. 
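
As an illustration of such an algorithm, the following Python sketch computes the partition function of all planar (pseudoknot-free) secondary structures for given pairing free energies. It is a minimal recursion under simplifying assumptions and not the method of [3, 4]: only pair energies enter (no loop entropies or stacking terms), bases are indexed from 0, and the function name, the matrix eta and the parameter min_loop are hypothetical choices made here.

```python
import math


def secondary_structure_partition_function(eta, beta=1.0, min_loop=0):
    """Sum the Boltzmann weights of all planar secondary structures.

    eta[k][j] is the (assumed) free energy of pairing bases k < j,
    beta the inverse temperature, and min_loop the minimal number of
    unpaired bases enclosed by a pair (an assumption of this sketch).
    """
    n = len(eta)
    # Z[i][j]: partition function of the backbone segment i, ..., j-1.
    # Segments of length 0 or 1 admit only the empty structure.
    Z = [[1.0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            # Case 1: the last base j-1 is unpaired.
            total = Z[i][j - 1]
            # Case 2: the last base j-1 is paired with base k; the pair
            # splits the segment into two independent parts.
            for k in range(i, j - 1 - min_loop):
                weight = math.exp(-beta * eta[k][j - 1])
                total += Z[i][k] * weight * Z[k + 1][j - 1]
            Z[i][j] = total
    return Z[0][n]
```

For vanishing pairing energies the recursion simply counts the allowed structures; with four bases and min_loop = 0 it returns 9. The cost grows as the cube of the backbone length.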
FIG. 1: Secondary structures of a random RNA molecule at distant 
times. Base pairings can be nested, such as (s, t) and (s', t'), or 
independent, such as (s, t) and (s'', t''). The pairing overlap is defined 
by the common base pairings between the left and right configuration 
(the corresponding bases are shown in black). (a) Above $T_c$, the 
molecule contains conserved subfolds on scales up to the correlation 
length $\xi$ (marked by shading) and is molten on larger scales. (b) Below 
$T_c$, the molecule is locked into its minimum energy structure on 
all scales, up to rare fluctuations (unshaded). 



The simplest class of such molecules is homopolymers, 
where all Watson-Crick pairs (s, t) contribute an equal 
amount f of free energy. At room temperature, where f is 
typically of order $k_BT$, homopolymers have a molten phase of 
compact stem-loop folds. The fold of an individual polymer 
in the molten phase is not unique. It changes over time since 
thermal fluctuations continuously build and undo its stems. 
The pairing probability of two bases decays as a power law 
of their backbone distance [7], $(t - s)^{-\rho_0}$, with $\rho_0 = 3/2$. In 
a heteropolymer, the energy of a Watson-Crick pair (s, t) de- 
pends on the nucleotides at the backbone positions paired. An 
important class is random heteropolymers. In biological systems, 
such sequences result from evolution by neutral mutations [8]. 
For functional RNA, sequences and conformations are 
further modified by selection, but random sequences remain 
important as reference statistics. A well-known analytical description 
of this case is to approximate the pairing free energies 
$\eta(s,t)$ as independent Gaussian random variables given 
by 

  $\overline{\eta(s,t)} = f, \qquad \overline{\eta(s,t)\,\eta(s',t')}^{\,c} = \sigma^2\,\delta(s-s')\,\delta(t-t'),$   (1)

where f and $\sigma$ are of order $k_BT$. 
Free energy estimates and numerical simulations indicate that RNA random 
heteropolymers undergo a transition at a critical temperature 
$T_c$ (about room temperature) from the molten phase to a low-temperature 
glass phase. The nature of this phase is controversial, 
and the numerical studies may suffer from 
significant finite-size effects. The two phases can be distinguished 
by disorder-induced replica correlations. Replicas 
are simply two secondary structures at distant times (i.e., 
drawn independently from the thermal ensemble) of the same 
RNA molecule, i.e., the same disorder configuration $\eta(s,t)$, as 
shown in fig. 1. Correlations between replicas are defined by 
subsequent averaging over the disorder distribution (1). The 
arguments of [11, 12] for the pairing overlap (defined as the 
joint probability of two bases being paired in both replicas) 
suggest that replicas become independent at large backbone 
distances in the molten phase but are essentially locked into a 
single conformation in the glass phase. 
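
To make the notion of pairing overlap concrete, here is a small Python sketch for a single pair of replicas. It represents a secondary structure as a set of base pairs (s, t), checks the constraint that pairs are either independent or nested, and returns the pairs common to both replicas. The representation and the function names are assumptions of this sketch; the thermal and disorder averages that define the overlap correlations are not performed here.

```python
def is_secondary_structure(pairs):
    """Check that the pairs (s, t) form a valid secondary structure:
    s < t, every base is paired at most once, and any two pairs are
    either independent or nested (no pseudoknots)."""
    ordered = sorted(pairs)
    used = set()
    for s, t in ordered:
        if not s < t or s in used or t in used:
            return False
        used.update((s, t))
    for index, (s, t) in enumerate(ordered):
        for s2, t2 in ordered[index + 1:]:
            # Here s < s2; the two pairs cross if s < s2 < t < t2.
            if s2 < t < t2:
                return False
    return True


def pairing_overlap(replica_a, replica_b):
    """Return the base pairs present in both replicas."""
    return set(replica_a) & set(replica_b)
```

For example, the replicas {(1, 8), (2, 7), (3, 6)} and {(1, 8), (2, 7), (4, 5)} share the two outer pairs, whereas {(1, 3), (2, 4)} is rejected as a pseudoknot.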

In this letter, we develop a systematic field theory of random 
RNA secondary structures. This theory has two basic 
fields. The contact field $\Phi(s,t)$ is defined to be 1 if 
the bases s and t are paired and 0 otherwise. The overlap 
field between two replicas $\alpha$ and $\beta$, defined as $\Psi_{\alpha\beta}(s,t) = 
\Phi_\alpha(s,t)\,\Phi_\beta(s,t)$, describes correlations between the replicas. 
By means of the height field $h(r) = \sum_{s=1}^{r} \sum_{t=r+1}^{L_0} \Phi(s,t)$, 
any secondary structure can be mapped onto a random walk 
$h(r)$ ($r = 0, \dots, L_0$) with step size $h(r) - h(r-1) = \pm 1$ 
and boundary conditions $h(0) = h(L_0) = 0$. This mapping 
relates random RNA folds to the simpler problems of 
directed polymers in a disordered medium [19, 20] and Kardar-Parisi-Zhang 
surface growth [21]. Generalizing existing 
field theoretic approaches [22, 23, 24], we derive renormalization 
equations for the two fundamental variables of the theory, 
the disorder strength and the backbone length. The large-distance 
scaling of pairing probability and replica overlap is 
given by the disorder-averaged expectation values 



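  $\overline{\langle \Phi(s,t) \rangle} \sim (t-s)^{-\rho}, \qquad \overline{\langle \Psi_{\alpha\beta}(s,t) \rangle} \sim (t-s)^{-\theta}.$   (2)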

Here $\rho_0 = 3/2$ and $\theta_0 = 3$ are the known exponents of the 
molten phase. At $T_c$, our renormalization group gives 
first-order values $\rho^* = \theta^* \approx 11/8$. As will become clear 
below, the equality $\rho^* = \theta^*$ is an exact (though not rigorous) 
conclusion beyond first order provided the renormalization 
group scenario sketched in fig. 3 is qualitatively correct, 
i.e., the true exponents are monotonic in p at fixed $\epsilon$. This 
equality implies that two replicas are essentially locked into a 
single conformation already at the transition. Hence, the leading 
scaling is given by the minimum-energy configuration for 
all temperatures $T < T_c$, i.e., the exponents $\theta^* = \rho^*$ govern 
the glass phase as well. The height fluctuations 

  $\overline{\langle (h(r) - h(r'))^2 \rangle} \sim |r - r'|^{2\zeta_0}$   ($T > T_c$),
  $\overline{\langle (h(r) - h(r'))^2 \rangle} \sim |r - r'|^{2\zeta^*}$   ($T < T_c$),   (3)



with $\zeta^* \approx 5/8$ are linked to the contact correlations by 
the exact scaling relation $\zeta + \rho = 2$ in all phases, which 
follows from the continuum representation of the h field, 
$h(r) = \int_0^r ds \int_r^{L_0} dt\, \Phi(s,t)$ [25], and has been obtained previously 
in a closely related context [26]. These exponents 
agree well with the numerical values $\zeta_{\rm glass} = 0.65$ [12, 13] 
and $\rho_{\rm glass} = 1.3(4)$ for $T \to 0$. 
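
The height field can be evaluated directly from a list of base pairs. The Python sketch below is a minimal illustration, assuming bases labeled 1, ..., L and a structure given as pairs (s, t) with s < t; the function name is a hypothetical choice. In this discrete version an unpaired base contributes a step of size 0, so the strict ±1 walk of the text corresponds to configurations in which every base is paired.

```python
def height_profile(pairs, length):
    """Return [h(0), ..., h(length)], where h(r) counts the pairs
    (s, t) that span position r, i.e. that satisfy s <= r < t."""
    heights = [0] * (length + 1)
    for s, t in pairs:
        for r in range(s, t):
            heights[r] += 1
    return heights
```

For the nested pairs (1, 4) and (2, 3) on a backbone of length 4 this gives the profile [0, 1, 2, 1, 0], which satisfies the boundary conditions h(0) = h(L) = 0.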

Our results show that the glass transition is of second order. 
A singular length scale 

  $\xi \sim |T - T_c|^{-\nu^*}$,   (4)



whose exponent $\nu^* = 1/(2 - \theta^*) \approx 8/5$ is determined by hyperscaling, 
describes the crossover scaling above and below 
the critical point. The resulting freezing scenario of random 
RNA molecules is quite intricate. It is illustrated in fig. 1, 
where we show snapshots of the same molecule at two distant 
times for two different temperatures. Above $T_c$, the correlations 
(2), (3) scale with their critical exponents $\rho^*$, $\theta^*$, $\zeta^*$ 
up to backbone distances (t - s) or |r - r'| of order $\xi$. 
Hence, an RNA fold has essentially frozen "islands" of size $\xi$ 
(i.e., its replicas are locked) but is molten on larger scales (its 
replicas become independent), see fig. 1(a). As T approaches 
$T_c$ from above, the replica correlation length $\xi$ increases according 
to (4), and the turnover time between conformations 
by thermal fluctuations grows. We call this process primary 
freezing. At criticality, there is still a power law distribution 
of rare thermal fluctuations as discussed below. Lowering the 
temperature below $T_c$, the correlation length decreases again 
and even these rare fluctuations are removed from larger to 
smaller scales; this is called secondary freezing, see fig. 1(b). 
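
For orientation, the numerical values quoted so far are related by simple arithmetic. The lines below only restate the relations given in the text (the scaling relation between height and contact exponents, and hyperscaling), evaluated at the first-order value 11/8 for the locked exponents; the power-law form of the correlation length is assumed to be the usual one.

```latex
\begin{align}
  \zeta^* &= 2 - \rho^* \approx 2 - \tfrac{11}{8} = \tfrac{5}{8}, \\
  \nu^*   &= \frac{1}{2 - \theta^*} \approx \frac{1}{2 - \tfrac{11}{8}} = \tfrac{8}{5}, \\
  \xi     &\sim |T - T_c|^{-\nu^*} \approx |T - T_c|^{-8/5}.
\end{align}
```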

To derive our renormalization group, we write the secondary 
structure partition function of a given heteropolymer 
as a sum over the contact field configurations, 

  $Z[\eta] = \sum_{\Phi} \exp\Big[ -\beta \sum_{1 \le s < t \le L_0} \eta(s,t)\, \Phi(s,t) \Big],$   (5)



and study the disorder-averaged free energy $\mathcal{F} = 
-\beta^{-1}\, \overline{\log Z[\eta]}$ obtained from the distribution (1). In 
the replica formalism, this leads to a system of p interacting 
homopolymers, $Z^{(p)} = \sum_{\Phi_1, \dots, \Phi_p} \exp(-\beta \mathcal{H}^{(p)})$, whose 
Hamiltonian [11, 12] 



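  $\mathcal{H}^{(p)} = -g_0 \sum_{\alpha < \beta} \sum_{s<t} \Psi_{\alpha\beta}(s,t) + f_0 \sum_{\alpha} \sum_{s<t} \Phi_\alpha(s,t)$   (6)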



is given in terms of the contact fields $\Phi_\alpha$ ($1 \le \alpha \le p$) and 
the overlap fields $\Psi_{\alpha\beta}$ ($1 \le \alpha, \beta \le p$, $\alpha \ne \beta$) with the coupling 
constants $f_0 = f - \beta\sigma^2$ and $g_0 = \beta\sigma^2$. The renormalization 
of this theory is based on analytic continuation in 
the homopolymer exponent $\rho_0$, or equivalently, in the scaling 
dimension $\epsilon := 2\rho_0 - 2$ of the coupling constant $g_0$ [27]. In 
the limit $p \to 0$, the free energy $\mathcal{F}^{(p)} = -\beta^{-1} \log Z^{(p)}$ reproduces 
that of the random system, $\mathcal{F} = \lim_{p \to 0} \mathcal{F}^{(p)}/p$. 

The noninteracting theory ($g_0 = 0$) describes homopolymers 
in the molten phase and is exactly solvable in the continuum 
limit, i.e., for molecules of backbone length $L_0 \gg 1$. 
The free energy for closed rings is $\mathcal{F}_0 = p\,\rho_0 \log L_0$ [7, 10]. 
The correlation function of N contact fields $\Phi_\alpha(s_i, t_i)$ describes 
constrained configurations of the molecule with N 
fixed pairings $(s_i, t_i)$ ($i = 1, \dots, N$). These pairings generate 
N + 1 subrings of backbone lengths $\ell_1, \dots, \ell_N$, $\ell_{N+1} = 
L_0 - \sum_{j=1}^{N} \ell_j$. Since the secondary structure fluctuations in 
the subrings are independent, this correlation takes the factorized 
form 



  $\langle \Phi_\alpha(s_1,t_1) \cdots \Phi_\alpha(s_N,t_N) \rangle_0 = \frac{\ell_1^{-\rho_0} \cdots \ell_{N+1}^{-\rho_0}}{L_0^{-\rho_0}}.$   (7)
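
The factorized form can be evaluated mechanically once the subring lengths are known. The following Python sketch is one possible way to do this, assuming the fixed pairings are planar (nested or independent), have distinct end points with 0 <= s < t <= L_0, and cut out subrings of nonzero length. The function names, the event-based bookkeeping and the convention that the last entry is the outermost subring are choices of this sketch.

```python
def subring_lengths(pairs, total_length):
    """Backbone lengths of the N + 1 subrings generated by N planar
    pairings on a ring of length total_length. Entry i is the subring
    directly inside pair i; the last entry is the outermost subring."""
    events = []
    for index, (s, t) in enumerate(pairs):
        events.append((s, 0, index))  # pair opens at s
        events.append((t, 1, index))  # pair closes at t
    events.sort()
    lengths = [0.0] * (len(pairs) + 1)
    stack = [len(pairs)]  # start in the outermost subring
    position = 0.0
    for point, kind, index in events:
        # The backbone stretch since the last event belongs to the
        # innermost pair that is currently open.
        lengths[stack[-1]] += point - position
        position = point
        if kind == 0:
            stack.append(index)
        else:
            stack.pop()
    lengths[stack[-1]] += total_length - position
    return lengths


def free_contact_correlation(pairs, total_length, rho0=1.5):
    """Product of subring weights divided by the weight of the full
    ring, with each ring of length ell weighted by ell ** (-rho0)."""
    value = total_length ** rho0
    for ell in subring_lengths(pairs, total_length):
        value *= ell ** (-rho0)
    return value
```

For the nested pairings (1, 4) and (2, 3) on a ring of length 10, the subring lengths are 2, 1 and 7, which add up to the total length as required.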



Overlap correlations factorize further into the contributions 
of the single replicas upon insertion of the definition 
$\Psi_{\alpha\beta} = \Phi_\alpha \Phi_\beta$. 

In the presence of interactions, we write the free energy as 
a perturbation series, 



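  $\mathcal{F}^{(p)} = \mathcal{F}_0 - \frac{p(p-1)}{2} \Big[ g_0 \int_{0<s_1<t_1<L_0} \langle \Psi_{\alpha\beta}(s_1,t_1) \rangle_0$
  $\qquad + g_0^2 \int_{0<s_1<t_1<s_2<t_2<L_0 \ \text{or}\ 0<s_1<s_2<t_2<t_1<L_0} \Big( \langle \Psi_{\alpha\beta}(s_1,t_1)\, \Psi_{\alpha\beta}(s_2,t_2) \rangle_0^c$
  $\qquad + 2(p-2)\, \langle \Psi_{\alpha\beta}(s_1,t_1)\, \Psi_{\alpha\gamma}(s_2,t_2) \rangle_0^c \Big) \Big] + O(g_0^3).$   (8)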

This series contains connected overlap correlations evaluated 
at $g_0 = 0$. The first-order term involves two, the second-order 
terms involve two and three pairwise different replicas, 
respectively; see fig. 2(a)-(c). The integration over the contact 
points in (8) produces a singular dependence of the free 
energy on $g_0$ as well as ultraviolet-divergent terms which are 
regular in $g_0$. Performing these integrals and expanding about 
the point of marginality ($\epsilon = 0$), we obtain the leading singular 
part 



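  $\mathcal{F}^{(p)} = p \Big[ \rho_0 \log L_0 - (p-1) \frac{u_0}{\epsilon} - (p-1)\, C_p \frac{u_0^2}{2\epsilon^2} + O(u_0 \epsilon^0, u_0^2/\epsilon, u_0^3) \Big]$   (9)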



with the dimensionless coupling constant $u_0 = g_0 L_0^{-\epsilon}$ and 
$C_p = 1 - 2(p - 2)$. The poles in (9) are absorbed into a 
renormalized coupling $g = Z_g g_0$ and a renormalized backbone 
length $L = Z_L L_0$, such that the free energy becomes an 
analytic function of the dimensionless coupling $u = g L^{-\epsilon}$. 
In a minimal subtraction scheme, we extract from (9) these 
Z-factors to leading order, 

  $Z_g = 1 - C_p \frac{u}{\epsilon} + O(u^2), \qquad Z_L = 1 - (p-1) \frac{u}{\epsilon} + O(u^2).$   (10)






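FIG. 2: Overlap correlations in the series (8). (a) Two-replica one-point 
function $\langle \Psi_{\alpha\beta}(s_1,t_1) \rangle_0$. (b) Two-replica two-point function 
$\langle \Psi_{\alpha\beta}(s_1,t_1)\, \Psi_{\alpha\beta}(s_2,t_2) \rangle_0$. (c) Three-replica two-point function 
$\langle \Psi_{\alpha\beta}(s_1,t_1)\, \Psi_{\alpha\gamma}(s_2,t_2) \rangle_0$. 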



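The resulting renormalization group flow takes a simple form 
with respect to the renormalized scale L, 

  $\tilde\beta(u) = L \frac{\partial}{\partial L} u = -\epsilon u + C_p u^2 + O(u^3),$   (11)

  $\gamma_L(u) = \frac{L}{L_0} \frac{\partial}{\partial L} L_0 = 1 + (p-1) u + O(u^2).$   (12)

The beta function is defined as the flow with respect to the 
original scale $L_0$, 

  $\beta(u) = L_0 \frac{\partial}{\partial L_0} u = \frac{\tilde\beta(u)}{\gamma_L(u)} = \frac{-\epsilon u + C_p u^2 + O(u^3)}{1 + (p-1) u + O(u^2)}.$   (13)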

It has a nontrivial fixed point $u^* = \epsilon/C_p + O(\epsilon^2)$ for generic 
p, which is ultraviolet-unstable for $\epsilon > 0$ and marks the RNA 
glass transition for $\epsilon = 1$, $p \to 0$. The $\epsilon$-expansion can 
be analyzed at higher orders using the operator product expansion 
of the fields $\Phi$ and $\Psi$. Generalizing the arguments 
of [22, 23, 24], we find that the theory is renormalizable in g and 
L (for details, see [29]). The field $\Phi$ is renormalized by a factor 
$Z_\Phi = Z_L^2 + O(u^2)$. By the scaling relation $\zeta + \rho = 2$, 
this implies "superdiffusive" height fluctuations with exponent 
$\zeta^* = \zeta_0/\gamma_L^* + O(\epsilon^2)$ for p < 1, where $\gamma_L^* = \gamma_L(u^*)$ [30]. 
The renormalization of $\Psi$ is tied to that of its conjugate coupling 
g. Hence, the dimensions of $\Phi$ and $\Psi$ at the transition 
are two independent exponents, 



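  $\rho^* = \frac{\rho_0 + L \frac{\partial}{\partial L} \log Z_\Phi}{\gamma_L(u^*)} = \frac{1 + \epsilon/2 + 2(p-1)\epsilon/C_p}{1 + (p-1)\epsilon/C_p},$

  $\theta^* = 2 - \frac{\epsilon}{1 + (p-1)\epsilon/C_p};$   (14)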



the omitted terms are of order p - 2 and $\epsilon^2$. These expressions 
are valid within the constraints $\theta^* \ge \rho^*$, since two-replica 
overlap correlations decay at least as fast as single-replica 
ones, and $\zeta^* \ge \zeta_0$. The resulting dependence of the critical 
exponents on p and $\epsilon$ is shown in fig. 3. (a) For p = 2, we 
have shown that the theory is one-loop renormalizable, i.e., 
the expressions up to (13), and (14) for $\theta^*$, are exact [29]. 
This reflects the exact summability of the partition function 
as shown in [12] for $\epsilon = 1$. We have generalized this solution 
at the transition point to arbitrary $\epsilon$, giving $\zeta^* = \zeta_0$ and 
$\rho^* = \rho_0$ (the renormalization group results are subleading). 
For $\epsilon = 1$, we thus have $\theta^* = \rho^* = 3/2$. Hence, two replicas 
are essentially locked into a single conformation already 
at the transition. The borderline value $\epsilon_c = 1$ corresponds 
to the upper critical dimension $d_{uc} = 4$ of directed polymers. 
(b) For p = 1, renormalization gives $\rho^* = \rho_0$ exactly 
to all orders, and $\theta^* = 2 - \epsilon + O(\epsilon^2)$. This produces 
a borderline value $\epsilon_c \approx 2/3$, beyond which $\theta^* = \rho^* = \rho_0$ 
exactly. (c) For p = 0, the first-order eq. (14) produces an 
even smaller value of $\epsilon_c$. For $\epsilon = 1$, we find locked configurations 
with $\theta^* = \rho^* = 2 - \zeta^* \approx 11/8$ as reported above. 
For $\epsilon > \epsilon_c$, the renormalization-group exponent $\theta^*$ in (14) 
describes a subleading singularity in the overlap correlations, 
which is related to rare critical fluctuations within the locked 
state [29], cf. [32] for directed polymers. 

FIG. 3: The critical exponents $\rho^*$ and $\theta^*$ as a function of $\epsilon$ for 
(a) p = 2, (b) p = 1, and (c) p = 0. Exact results (thick solid), 
renormalization group results valid to all orders (thick dashed) or 
to first order (thin dashed), presumably exact values (see text, thin 
solid), reference line $\rho_0(\epsilon)$ (dotted). 
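
The first-order numbers can be checked with exact rational arithmetic. The Python sketch below evaluates the fixed point and the exponents as read here from eqs. (11)-(14), namely C_p = 1 - 2(p - 2), u* = ε/C_p, γ_L* = 1 + (p - 1)u*, ρ* = (ρ_0 + 2(p - 1)u*)/γ_L* and θ* = 2 - ε/γ_L*, and then imposes the constraint that θ* cannot fall below ρ*. The function name and the returned dictionary are hypothetical, and the exact results quoted in the text for p = 2 are not encoded.

```python
from fractions import Fraction


def first_order_exponents(p, eps):
    """First-order exponents at the fixed point for replica number p
    and expansion parameter eps (both given as exact rationals)."""
    p, eps = Fraction(p), Fraction(eps)
    c_p = 1 - 2 * (p - 2)              # one-loop coefficient C_p
    u_star = eps / c_p                 # fixed point of the flow
    gamma_l = 1 + (p - 1) * u_star     # gamma_L at the fixed point
    rho0 = 1 + eps / 2                 # from eps = 2 * rho0 - 2
    rho = (rho0 + 2 * (p - 1) * u_star) / gamma_l
    theta_rg = 2 - eps / gamma_l       # unconstrained value
    theta = max(theta_rg, rho)         # locking: theta* >= rho*
    return {
        "rho": rho,
        "theta_rg": theta_rg,
        "theta": theta,
        "zeta": 2 - rho,               # scaling relation zeta + rho = 2
        "nu": 1 / (2 - theta),         # hyperscaling
    }
```

For p = 0 and ε = 1 this reproduces the values 11/8 for the locked pairing and overlap exponents, 5/8 for the height exponent and 8/5 for the correlation length exponent, while the unconstrained overlap exponent comes out as 3/4.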

Despite its technical difficulties, our renormalization is 
rather intuitive since it acts directly on the fold configurations 
of fig. 1. In a Wilson scheme, we would produce coarse-grained 
folds with varying short-distance cutoff $\ell_{\min}$ by integrating 
out subconfigurations of backbone length $\ell < \ell_{\min}$. 
This leads to a scale-dependent backbone length L and coupling 
constant g. For p > 1, the attractive replica interaction 
produces additional short loops, which are cut off under 
coarse-graining, i.e., the effective length is shorter than without 
interaction ($L \sim L_0^{1/\gamma_L^*}$ with $\gamma_L^* > 1$). For p < 1, however, 
this effect is reversed ($\gamma_L^* < 1$): L becomes longer and 
the random walk h(r) becomes correlated with superdiffusive 
fluctuations ($\zeta^* = 1/(2\gamma_L^*) > 1/2$). Hence, the probability 
of first return is shifted from small to large scales, i.e., there 
are more pairings between distant nucleotides ($\rho^* < \rho_0$). The 
locking of pairing correlations ($\theta^* = \rho^*$) at criticality means 
that disorder already has its maximal effect on scaling, i.e., 
the same exponents govern the glass phase. This prediction 
is remarkable in contrast to random directed polymers, where 
the roughening transition has no locking for $2 < d < d_{uc}$ 
and the low-temperature physics is governed by a new strong-coupling 
fixed point. 

We thank R. Bundschuh and F. David for discussions. 